Part II: Tutorial

Author

Maarten Marsman

Published

August 3, 2025

Advanced Analysis I: Network Comparison (Optional)

In earlier sections, we tested whether an edge was present or absent within a single group.

Now we shift to a new question:

Does a network relation differ between groups?

We test whether an edge is different in strength across two (or more) independent samples. This is a Bayesian parameter comparison, where:

  • \(\mathcal{H}_0\): The edge weight is the same across groups (invariance)
  • \(\mathcal{H}_1\): The edge weight differs across groups

We model this difference as: \[ \sigma_{\text{group}} = \begin{cases} \sigma - \frac{1}{2}\delta & \text{for Group 1 (\texttt{x})}\\ \sigma + \frac{1}{2}\delta & \text{for Group 2 (\texttt{y})} \end{cases} \] This means that \(\mathcal{H}_0: \delta = 0\) and \(\mathcal{H}_1: \delta \neq 0\).
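To make the parametrization concrete, here is a toy numeric illustration (the values of \(\sigma\) and \(\delta\) are made up for this example, not bgms output):

```r
# Toy illustration: an overall edge weight sigma and a group difference
# delta imply the two group-specific edge weights
sigma <- 0.30  # hypothetical overall edge weight
delta <- 0.10  # hypothetical group difference

weight_group1 <- sigma - delta / 2  # Group 1 (x)
weight_group2 <- sigma + delta / 2  # Group 2 (y)

c(group1 = weight_group1, group2 = weight_group2)
# group1 group2
#   0.25   0.35
```

Under \(\mathcal{H}_0\) (\(\delta = 0\)) both groups share the same edge weight \(\sigma\).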

This is implemented in the bgms package via bgmCompare().

Note

You can compare two or more groups. The GitHub version of bgms supports flexible group comparisons (and also has improved output formatting):

https://github.com/Bayesian-Graphical-Modelling-Lab/bgms

To install the GitHub version (if needed):

install.packages("remotes")  # if not already installed
remotes::install_github("Bayesian-Graphical-Modelling-Lab/bgms")

Note: This version is only needed if you’re using features not yet available on CRAN.

Quick Example: Comparing French vs English Samples

# Load example data
load("Boredom.rda")
head(Boredom)

# Split by language group (column 1: "fr" = French, otherwise English)
french <- Boredom[Boredom[, 1] == "fr", -1]
english <- Boredom[Boredom[, 1] != "fr", -1]

# Run network comparison
fit_compare <- bgmCompare(x = french,
                          y = english,
                          iter = 1e4)  # You may need higher values in real work

Extract and interpret results:

The CRAN version's output format is slightly outdated, so you'll need to extract and inspect the results manually:

# Extract posterior inclusion probabilities and differences
pip <- fit_compare$indicator[lower.tri(fit_compare$indicator)]
bf <- pip / (1 - pip)
diff_parameters <- fit_compare$pairwise_difference[lower.tri(fit_compare$pairwise_difference)]

# Extract node names
nodes <- colnames(fit_compare$indicator)

# Generate edge labels from lower triangle positions
edge_labels <- c()
for (i in 2:length(nodes)) {
  for (j in 1:(i - 1)) {
    edge_labels <- c(edge_labels, paste(nodes[i], "-", nodes[j]))
  }
}

# Combine into a labeled data frame
comparison_results <- data.frame(
  edge = edge_labels,
  BF = bf,
  difference = diff_parameters
)

# View the result
print(comparison_results, digits = 2)
                             edge           BF    difference
1          loose_ends - entertain   0.30429112 -1.581856e-02
2         loose_ends - repetitive   0.02322726 -3.983520e-04
3        loose_ends - stimulation   2.60100828  6.314624e-02
4          loose_ends - motivated   0.25849484 -1.301823e-02
5      loose_ends - keep_interest   0.04712042 -1.613027e-03
6         loose_ends - sit_around   0.22085215 -1.274864e-02
7     loose_ends - half_dead_dull   1.69832704  4.953139e-02
8          entertain - repetitive   0.03114044 -5.865819e-04
9         entertain - stimulation   0.28551228  1.418548e-02
10          entertain - motivated   0.02375102  3.515613e-04
11      entertain - keep_interest   0.42247511 -2.422353e-02
12         entertain - sit_around   1.26346763  4.540304e-02
13     entertain - half_dead_dull  19.79002079  9.881347e-02
14       repetitive - stimulation   0.13173382  5.809652e-03
15         repetitive - motivated   0.01926409 -1.240824e-04
16     repetitive - keep_interest   0.38523341 -1.871762e-02
17        repetitive - sit_around   9.79913607  8.015141e-02
18    repetitive - half_dead_dull   0.23426314 -1.004382e-02
19        stimulation - motivated   0.08003024  3.227787e-03
20    stimulation - keep_interest   0.01916021  9.570792e-05
21       stimulation - sit_around          Inf -1.410068e-01
22   stimulation - half_dead_dull   0.02218133  3.131296e-04
23      motivated - keep_interest   0.01988781  1.085881e-04
24         motivated - sit_around   0.02072063  1.160316e-04
25     motivated - half_dead_dull   0.03896104 -1.087054e-03
26     keep_interest - sit_around   0.30565348  1.708048e-02
27 keep_interest - half_dead_dull   0.02658865 -5.114357e-04
28    sit_around - half_dead_dull 104.26315789  1.104887e-01

How to interpret?

  • \(BF_{10} > 10\) → strong evidence that an edge differs between groups
  • \(BF_{10} < 1/10\) → strong evidence that the edge is invariant
  • Everything in between → inconclusive difference

Just as before, we distinguish evidence for a difference from absence of evidence for a difference.
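For convenience, these cutoffs can be wrapped in a small helper function (our own illustration, not part of bgms):

```r
# Map inclusion Bayes factors onto the evidence categories used above:
# BF > 10 -> evidence for a difference, BF < 1/10 -> evidence for
# invariance, anything in between -> inconclusive
classify_bf <- function(bf) {
  ifelse(bf > 10, "evidence for difference",
         ifelse(bf < 1 / 10, "evidence for invariance",
                "inconclusive"))
}

classify_bf(c(104.26, 0.02, 2.6))
# "evidence for difference" "evidence for invariance" "inconclusive"
```

Applied to the `comparison_results` table above, this would flag, e.g., the sit_around - half_dead_dull edge (BF ≈ 104) as differing between groups.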

Visualize differences

We can also visualize the median probability network to inspect how the edges differ between the groups:

library(qgraph)
qgraph(fit_compare$pairwise_difference * (fit_compare$indicator > 0.5),
       labels = colnames(fit_compare$interactions),
       posCol = palette("Okabe-Ito")[4],
       negCol = palette("Okabe-Ito")[7])

  • Positive edges (greenish) indicate that the edge weight is larger in the English version/population and smaller in the French version/population.
  • Negative edges (orange) indicate that the edge weight is larger in the French version/population and smaller in the English version/population.

These are the edges with a posterior inclusion probability above 0.5, which together form the median probability model.


Using the GitHub Version with Group Labels

The GitHub version of bgms allows comparing more than two groups using a grouping variable.

Here is an example using the Boredom dataset:

# The Boredom data are included in the GitHub version of the package

# Run the group comparison (g = grouping variable)
fit <- bgmCompare(x = Boredom[, -1], g = Boredom[, 1])

# Extract and compute Bayes factors
pip <- fit$posterior_mean_indicator[lower.tri(fit$posterior_mean_indicator)]
bf <- pip / (1 - pip)
diff_parameters <- fit$posterior_mean_pairwise[, 3] - fit$posterior_mean_pairwise[, 2]

# Combine into results table
comparison_results <- data.frame(
  BF = bf,
  difference = diff_parameters
)
# View the results
round(comparison_results, 3)

# Create symmetric difference matrix for plotting
pairwise_difference <- matrix(0, nrow = ncol(fit$posterior_mean_indicator),
                              ncol = ncol(fit$posterior_mean_indicator))
pairwise_difference[lower.tri(pairwise_difference)] <- diff_parameters
pairwise_difference <- pairwise_difference + t(pairwise_difference)
colnames(pairwise_difference) <- colnames(fit$posterior_mean_indicator)
rownames(pairwise_difference) <- rownames(fit$posterior_mean_indicator)

# Visualize
library(qgraph)
qgraph(pairwise_difference * (fit$posterior_mean_indicator > 0.5),
       labels = colnames(pairwise_difference),
       posCol = palette("Okabe-Ito")[4],
       negCol = palette("Okabe-Ito")[7])

Simulation Evidence: Why a Bayesian approach to network comparison?

To motivate the Bayesian approach to network comparison, consider the results from a recent simulation study:

[Figure: AUC comparison of methods]

Each panel shows the Area Under the Curve (AUC) for detecting edge differences between two networks, under varying conditions:

  • Number of variables: 5, 10, or 20 (columns)
  • Proportion of differing edges: 5%, 10%, or 15% (rows)
  • Sample sizes ranging from 150 to 1200 per group (x-axis)

What is AUC?

To evaluate detection performance independently of a specific threshold, we use the Receiver Operating Characteristic (ROC) curve. This curve plots the false positive rate (1 – specificity) on the x-axis against the true positive rate (sensitivity) on the y-axis across a range of threshold values.

The Area Under the Curve (AUC) summarizes the ROC curve into a single value between 0 and 1:

  • An AUC of 0.5 indicates performance no better than random guessing.
  • Higher AUC values indicate better separation between true and false positives.
  • Because it aggregates across all thresholds, the AUC avoids arbitrary cutoffs and provides a robust way to compare methods.
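The AUC has a simple rank-based interpretation: it is the probability that a randomly chosen truly differing edge receives a higher detection score than a randomly chosen invariant edge. A minimal base-R sketch (the scores below are hypothetical, not from the simulation study):

```r
# AUC as a rank statistic: the proportion of (positive, negative) pairs
# in which the positive case scores higher (ties count as half)
auc <- function(scores_pos, scores_neg) {
  comparisons <- outer(scores_pos, scores_neg, ">") +
    0.5 * outer(scores_pos, scores_neg, "==")
  mean(comparisons)
}

# Hypothetical detection scores for edges with / without a true difference
auc(scores_pos = c(0.9, 0.8, 0.6), scores_neg = c(0.7, 0.3, 0.2))
# 8 of 9 pairs ranked correctly: 8/9 = 0.889
```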

What do we see?

  • The Bayesian MRF methods (bgms(B) and bgms(BB)) outperform both NCT (frequentist) and BGGM (Bayesian, but limited to continuous data/GGMs).
  • These advantages are especially pronounced in small samples and low-signal settings, which are common in psychological research.
  • For large samples and a small number of variables, NCT is competitive.

A current limitation

While the bgms approach is powerful for detecting edge-level or threshold-level differences between groups, it currently does not support an omnibus test for overall network structure equality.

That means:

  • You can test whether specific relations differ between groups (e.g., edge A–B is stronger in group 1).
  • But you cannot formally test (yet) whether the entire network structure is the same across groups.

Advanced Analysis II: Network Clustering (Optional)

Another advanced feature available in bgms is network clustering: identifying groups of nodes that tend to co-occur or form modular substructures.

This is made possible by using a Stochastic Block Model (SBM) prior, which encourages clustered edge structures.

While we won’t go into detail here, an excellent hands-on walkthrough is available from Nikola Sekulovski:

https://www.nikolasekulovski.com/blog/post2/

It covers:

  • How to use the bgms SBM prior
  • How to visualize the resulting block structure
  • How to interpret the clusters in psychological networks

You can explore this in your own data!